Abstract

In wireless access network optimization, today's main challenges reside in traffic offload and in the
improvement of both network capacity and coverage. Operators are interested in solving their
localized coverage and capacity problems in areas where the macro network signal is not able to serve
the demand for mobile data. Thus, the major issue for operators is to find the best solution at reasonable
expense. The femto cell seems to be the answer to this problem. In this work, we focus on the
problem of sharing femto access between customers of the same mobile operator. This problem can be
modeled as a game where service requesters customers (SRCs) and service providers customers (SPCs)
are the players.

This work addresses the femto access sharing problem with a single SPC using game theory
tools. We consider that SRCs are static and have similar and regular connection behavior. We
also assume that the SPC and each SRC run embedded software, on the femto access point and on the user
equipment (UE) respectively.

After each connection requested by a SRC, its software learns the strategy that increases its gain,
knowing that no information about the other SRCs' strategies is given. This article presents
a distributed learning algorithm with incomplete information running in the SRCs' software. We then
answer the following questions for a game with N SRCs and one SPC: how many connections are
necessary for each SRC to learn the strategy maximizing its gain? Does this algorithm converge to
a stable state? If yes, is this state a Nash equilibrium, and is there any way to shorten the learning
process triggered by the SRCs' software?

Keywords: game theory, sharing femto access, Nash equilibrium, distributed learning
algorithm, stable state.



1 Introduction 

Today, one of the biggest issues for mobile operators is to provide acceptable indoor coverage for wireless
networks. Among the several in-building solutions, the femto cell is the one gaining the most significant
interest. A femto cell is a small cellular base station, designed for residential or small business use, that is
characterized by low transmission power and by access limited to a closed user group. Its high purchase cost,
however, does not encourage customers to buy it. Yet this solution could help operators solve localized coverage
problems and extend their network. Indeed, in some areas where the macro network signal is weak, a network of open
femto cell accesses would significantly improve voice quality and data connectivity. This would be feasible if
access point owners accepted to be part of a club where each member is willing to open up its access point
to the other members. This idea of sharing part of one's bandwidth started with FON², a club where members
share their WiFi connection, and inspired us to propose a club where members share their femto accesses
with bandwidth guarantees. A femto club member could share its 3G/LTE signal securely with other club
members. Incentives for the owner of an access point to join such a club include not only sharing
part of the cost of the access point but also advertising or sharing information through
a specific social network associated with the club. These incentives would logically lead the club to manage
such bandwidth exchange by itself, but since this technology uses licensed spectrum, the only model
that can be adopted for the moment for sharing femto access is the one where the mobile operator
also participates. Sharing femto access is thus a service proposed by the mobile operator to its clients. These
customers are divided into service providers customers (SPCs) and service requesters customers (SRCs):
SPCs are the owners of femto cell accesses, for which they have contracted with a mobile operator denoted
by MO. SRCs are customers using a mobile terminal in an area covered by some SPCs' access points and
requesting to use these access points. Note that a user can be both a SPC and a SRC. Dynamic femto
spectrum sharing is a challenging problem for all the actors. Indeed, the amount of bandwidth requested by
SRCs, the amount of bandwidth shared by SPCs, and the pricing should be determined such that the
utility of all the agents is maximized. Since the interests of the actors can be antagonistic, especially
between several SRCs requesting the same SPC, we model our system as a game to determine the equilibria of
such situations.



Related works 

The problem of sharing bandwidth and pricing has already been addressed by Dusit Niyato et al., who
modeled it as a game. The challenging problem in this context is that bandwidth sharing requires a peaceful
co-existence of both primary and secondary users. The femto access sharing we present in this article is
similar but also takes into account the profiles of both SRCs and SPCs.

The potential games introduced by Rosenthal [8] are classical games having at least one pure Nash
equilibrium. These games have a potential function such that each of its local optima corresponds to a
pure Nash equilibrium. This property has been used for congestion games in general,
for resource reuse in a wireless context, and for a real-time spectrum sharing problem with QoS
provisioning.

A decentralized learning algorithm for Nash equilibria in multi-person stochastic games with incomplete
information has been presented by M.A.L. Thathachar et al. In the considered game, the distribution of the
random payoff is unknown to the players, and none of the players knows the strategies or the actual
moves of the other players. It is proved that all stable stationary points of the algorithm are Nash equilibria
of the game [7]. The study presented in this article uses this algorithm in the game restricted to SRCs,
where each SRC learns the strategy maximizing its gain using only local information. We check
whether the stationary point the algorithm converges to is a pure Nash equilibrium.



Our contribution. Section 2 presents the model for sharing femto access and describes the actors involved in this
model. The game considering only the competition between SRCs is presented in Section 3. Section 4
details the principle of a distributed algorithm used to learn the Nash equilibrium (NE) of the game, if any exists,
and some simulation results are given in Section 4.4. Finally, Section 5 draws a general conclusion and gives
some perspectives.

2 FON is a for-profit company incorporated and registered in the U.K. FON was created in Madrid, Spain, by Martin Varsavsky, 
an Argentine/Spanish entrepreneur and founder of many companies in the last 20 years. 
2 Model for sharing femto access



We now present the model for sharing femto access. Our system is composed of SPCs and SRCs. SPCs
share some amount of bandwidth with SRCs. We assume that there exists a Token Based Accounting
Protocol to exchange services between SRCs and SPCs against tokens [3]. On the one hand, the paradigm for
spectrum sharing should guarantee a certain access speed and data transmission speed. On the other hand,
the model should propose a type of connection (more expensive than the others) which guarantees
QoS to SRCs: connections belonging to this type will never be canceled by the SPC.

Our work considers a unique SPC denoted by X and several SRCs. We assume that the SPC's resource
reserved for sharing is an amount of bandwidth denoted by $B_S(X)$. Each SRC
Y requests a connection from X. The bandwidth $B_S(X)$ of SPC X is divided into two parts:

$B_S(X) = B_{S_G}(X) + B_{S_Y}(X)$

• $B_{S_G}(X)$ is the (Green) part of the bandwidth in which SRCs' communications can never be preempted. It is a kind
of guaranteed QoS allocated to SRCs.

• $B_{S_Y}(X)$ is the (Yellow) part of the bandwidth in which SRCs' communications can be preempted. This preemption
is due to the fact that the SPC has priority on this part of the bandwidth. Thus, a communication allocated
in $B_{S_Y}(X)$ is characterized by a risk of preemption.

Figure 1 introduces the bandwidth sharing process and the billing process in both cases of a Green and
a Yellow connection allocation. Consider a SRC Y who needs an amount of bandwidth equal to bw: bw
is allocated to Y only if it is free. If X later needs bw, it cannot use it if the connection
allocated to Y is Green. However, X can preempt the connection allocated to Y if it is a Yellow
one.

As stated above, a Token Based Accounting Protocol is used to exchange services between SRCs
and SPCs against tokens (representing money) [3]. The billing process depends on whether the SRC has
used a Green connection or a Yellow one.

In the case of a Green connection, the SRC spends $N_1$ tokens corresponding to the used bandwidth
bw. In the case of a Yellow connection, the SRC spends $N_2$ tokens ($N_2 < N_1$) if the connection succeeds.
Indeed, the Yellow connection is cheaper than the Green one due to the risk of preemption. If the Yellow
connection given to the SRC fails, this connection is free.
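
The billing rule above can be sketched in a few lines. This is only an illustration: the function name `tokens_charged` and the example prices are assumptions, since the text only fixes the ordering $N_2 < N_1$ and the fact that a preempted Yellow connection is free.

```python
def tokens_charged(connection, succeeded, n1, n2):
    """Tokens the SRC pays for one connection (illustrative helper).

    connection: "green" or "yellow"
    succeeded:  False if a yellow connection was preempted by the SPC
    n1, n2:     green and yellow prices for the used bandwidth (n2 < n1)
    """
    if n2 >= n1:
        raise ValueError("the yellow price must be lower than the green price")
    if connection == "green":
        return n1                        # green connections are never preempted
    if connection == "yellow":
        return n2 if succeeded else 0    # a failed yellow connection is free
    raise ValueError("unknown connection type")
```

With hypothetical prices `n1=10` and `n2=6`, a preempted Yellow connection costs nothing while a successful one costs 6 tokens.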

2.1 Actors Description 

In this section, we describe the actors involved in the model that we have just presented.

2.1.1 SPC Actor

A SPC is a customer of the MO. The SPC proposes to share an amount of its bandwidth with SRCs for a
price per bandwidth unit which depends on the type of connection (Green, Yellow). The bandwidth split
into Green and Yellow parts is determined by its sensitivity to gain, denoted by $\mu \in [0, 1]$, and its
sensitivity to its own connection QoS, denoted by $\Gamma \in [0, 1]$. These two parameters are dual: $\mu + \Gamma = 1$.

The gain sensitivity parameter indicates the SPC's degree of sensitivity to the price of the shared connection, while
the QoS sensitivity parameter indicates the SPC's degree of tolerance towards the preemption risk. The more a SPC
is sensitive to gain, the bigger its Green bandwidth part will be, because of its higher selling price. The
more a SPC is sensitive to QoS, the bigger its Yellow bandwidth part will be, since it wants to keep the
ability to preempt SRCs. When $\mu > 1/2$, we say that the SPC is sensitive to gain. Otherwise, the SPC is
considered as sensitive to its access QoS.

Figure 1: Bandwidth sharing process and billing process

2.1.2 SRC Actor 

A SRC is also a customer of the MO. It needs QoS and is characterized by a QoS sensitivity parameter $\alpha$ and
a price sensitivity parameter $\beta$. The SRC wants to use the SPC's resources. The QoS sensitivity parameter
indicates the SRC's degree of tolerance towards QoS degradation, while the price sensitivity parameter
indicates the SRC's degree of tolerance towards the cost of the connection. These two parameters are dual:

$\alpha + \beta = 1$.

The more a SRC is sensitive to QoS, the lower its tolerance towards QoS degradation.
The more a SRC is sensitive to price, the less it is able to pay for the connection. When $\alpha > 1/2$,
the SRC is considered as sensitive to QoS. Otherwise, the SRC is sensitive to price.

2.1.3 Interaction between the SPC actor and SRCs actors 

In general, SRCs request some amount of bandwidth from the SPC depending on their
profiles. The SPC processes the SRCs' requests for a fixed bandwidth split.

1. For SRCs, the utility depends on their requests, the other SRCs requests as well as the SPC's decision 
(bandwidth allocated and type of connection) for a fixed SPC's bandwidth split. So, the SRC's utility 
depends mainly on its competition with other SRCs to receive some bandwidth from the same SPC. 

2. For the SPC, the utility depends on its bandwidth split for fixed SRCs requests. Since we consider
only one SPC, the SPC does not compete with other SPCs to raise its utility.

In our work, we will only focus on the competition between SRCs when the SPC's bandwidth split is 
fixed. 
3 Game presentation



Since only the competition between SRCs is considered, the femto access sharing problem can be
modeled as a game where N SRCs are the players. We also consider that the SPC's bandwidth split is fixed. This
is motivated by the fact that learning methods are reliable only when the surrounding environment is
invariant. We do not consider the mobility of SRCs. However, we assume that SRCs have regular and
similar connection behavior: each SRC requests the SPC's connection in nearly the same time slots with almost
invariant needs in terms of QoS. This means that requesting the SPC's femto access becomes almost routine
for SRCs, which can be seen as a repeated game.

Over its requested connections, each SRC learns, thanks to an algorithm running in software
embedded in its user equipment, the best strategy to play in order to maximize its gain. In this article, the femto
access sharing game is referred to as the game restricted to SRCs.

The game restricted to SRCs is defined as follows: given a fixed SPC's bandwidth split (into a Green part
and a Yellow part), what is the best strategy to be played by the SRCs in order to reach a stable situation
where the strategy of each SRC player is optimal for it, considering the other SRCs' strategies? This
situation corresponds to a pure Nash equilibrium in game theory. Recall that a pure Nash equilibrium of a
game is a situation where, for each player, there is no unilateral strategy deviation that increases its utility.

3.1 Game restricted to SRCs 

As mentioned in the previous section, the game restricted to SRCs assumes that the SPC's bandwidth split
is fixed. Let $B_S$ be the total bandwidth the SPC agrees to share and let $\Psi_S$ be the proportion of $B_{S_G}$
in $B_S$.

3.1.1 SRCs QoS needs 

Each SRC's QoS needs depend on the type of application it requests. Requesting femto access is
equivalent to requesting an amount of bandwidth. Fixing this amount of bandwidth depends on the following
parameters: the type of application (real time, elastic), the QoS parameter that the application requires
(delay, file transfer time, ...), the type of connection (Green or Yellow) and the SRC's profile (QoS sensitive
SRC or price sensitive SRC).

Our work takes into account the file transfer application. QoS is defined as the file transfer time, which
we denote by t. The SRC's QoS satisfaction is related to t. For each SRC, the file transfer time should
be between $T_1$ and $T_2$, with the two extreme cases defined as follows:

• Case $t = T_1$: $BW_{max}$ corresponds to the bandwidth required to download a file in $t = T_1$. If the SPC
provides an amount of bandwidth equal to $BW_{max}$, then the SRC's QoS satisfaction is maximal.

• Case $t = T_2$: $BW_{min}$ corresponds to the bandwidth required to download a file in $t = T_2$. If the SPC
provides an amount of bandwidth equal to $BW_{min}$, then the SRC's QoS satisfaction is minimal.
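
As a worked illustration of these two bounds, the sketch below assumes that the required bandwidth is simply the file size divided by the transfer time (the text does not state the formula explicitly). The 1 Mbyte file and $T_2 = 300$ s are the values used later in the simulations; the value of `T1` is purely hypothetical.

```python
def required_bandwidth(file_size_bits, transfer_time_s):
    """Constant rate (bit/s) needed to move the file within the given time."""
    if transfer_time_s <= 0:
        raise ValueError("transfer time must be positive")
    return file_size_bits / transfer_time_s

FILE_SIZE_BITS = 8 * 10**6   # 1 Mbyte file, taking 1 Mbyte = 10^6 bytes
T1 = 60.0                    # hypothetical best-case transfer time (s)
T2 = 300.0                   # worst acceptable transfer time (s)

bw_max = required_bandwidth(FILE_SIZE_BITS, T1)   # full QoS satisfaction
bw_min = required_bandwidth(FILE_SIZE_BITS, T2)   # minimal QoS satisfaction
```

Under these assumptions `bw_min` is about 26.7 kbit/s, and any allocation between `bw_min` and `bw_max` yields a transfer time between $T_2$ and $T_1$.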

Each SRC requests a minimum amount of bandwidth and a maximum amount of bandwidth in Green
and Yellow depending on its profile and on the signal strength of its user equipment towards the SPC's femto cell.

Not all the SRCs' requests can be accepted. In fact, since the SPC's bandwidth is limited and since
several SRCs could request the SPC's connection at the same time, one possible response that a SRC
could receive is a denial. Requesting a minimum and a maximum amount of bandwidth decreases
the chances of receiving such a response. Besides, requesting a bandwidth interval generalizes
requesting a fixed amount of bandwidth. In this way, a SRC could receive an amount of bandwidth which
may differ from its optimal request but which spares it from having no payoff.

Figure 2: Bandwidth request illustration for a QoS sensitive SRC's strategy ($\alpha = 0.7$, $\varepsilon = 0.1$, and $\kappa = 0.1$).

The minimum and the maximum amount of bandwidth are fixed depending on the SRC's profile. Note
that the parameters characterizing a SRC's profile are real numbers in [0, 1]. We aim at translating these parameters
into intervals of bandwidth requests, which are actually integers. So, we introduce a parameter $\varepsilon$ representing
the discretization step of the requested bandwidth. Let $S_{SRC_i}$ be the set of possible strategies of $SRC_i$. In the
following, we focus on the request of a SRC denoted by $SRC_i$ according to its profile.

1. We consider the case where $SRC_i$ has its QoS sensitivity parameter $\alpha_i$ greater than 1/2. $SRC_i$ fixes
a revenue threshold under which it denies any proposed connection (high QoS degradation). This
threshold, denoted by $Rev\_Th_i$, corresponds to a minimum amount of bandwidth to be requested. This
parameter depends on the QoS sensitivity $\alpha_i$ of the SRC: $Rev\_Th_i = \alpha_i - \kappa$, where $\kappa \in [0, 1]$ is
the allowed variation from the QoS degradation tolerance fixed according to the SRC's profile (more
specifically $\alpha_i$).

$S_{SRC_i} = \{Rev\_Th_i, Rev\_Th_i + \varepsilon, Rev\_Th_i + 2\varepsilon, \dots, 1\}$.



2. We consider the case where $SRC_i$ has its QoS sensitivity parameter $\alpha_i$ less than 1/2. This implies that
its price sensitivity parameter is greater than 1/2. $SRC_i$ fixes a cost threshold, denoted by $Cost\_Th_i$,
above which it denies any proposed connection (the cost is beyond what it is able to pay). This
threshold corresponds to a maximum amount of bandwidth to be requested. This parameter depends
on the QoS sensitivity of the SRC and is defined as follows: $Cost\_Th_i = \alpha_i + \kappa$.

$S_{SRC_i} = \{Cost\_Th_i, Cost\_Th_i - \varepsilon, Cost\_Th_i - 2\varepsilon, \dots, 0\}$.

Figure 3: Bandwidth request illustration for a price sensitive SRC's strategy ($\alpha = 0.3$, $\varepsilon = 0.1$, and
$\kappa = 0.1$).
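
The two strategy sets defined above can be enumerated as in the following sketch. The function name `strategy_set` is illustrative, and the code assumes that the distance from the threshold to the end point (1 or 0) is a multiple of the step; values are rounded to avoid floating-point drift.

```python
def strategy_set(alpha, kappa=0.1, eps=0.1):
    """Discretized strategy set of one SRC, as a list of values in [0, 1].

    alpha: QoS sensitivity of the SRC
    kappa: allowed variation around the SRC's tolerance
    eps:   discretization step
    """
    if alpha > 0.5:
        # QoS sensitive SRC: start at the revenue threshold and step up to 1
        start = max(alpha - kappa, 0.0)
        steps = int(round((1.0 - start) / eps))
        return [round(start + j * eps, 10) for j in range(steps + 1)]
    # price sensitive SRC: start at the cost threshold and step down to 0
    start = min(alpha + kappa, 1.0)
    steps = int(round(start / eps))
    return [round(start - j * eps, 10) for j in range(steps + 1)]
```

For the profile of Figure 2 (0.7, 0.1, 0.1) this gives the five strategies 0.6, 0.7, 0.8, 0.9 and 1, while an extremely QoS sensitive SRC (sensitivity 1) is left with only two strategies, 0.9 and 1.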



Each element $s_i$ of $S_{SRC_i}$ defines an interval of bandwidth to be requested in Green and
Yellow connection. This represents a couple $(g_i = [m_i^G, M_i^G], y_i = [m_i^Y, M_i^Y])$ of couples of integers.
The parameters $m_i^X$ and $M_i^X$ represent respectively the minimum and the maximum amount of bandwidth to
be requested in an X connection, where $X \in \{G, Y\}$. They are defined as follows:



1. Case where $SRC_i$ has its QoS sensitivity parameter $\alpha_i$ greater than 1/2:

(a) mf = BW ™ a * and mj = BW ™°-* x (1 - 6) 

(b) $M_i^G = BW_{max}$ and $M_i^Y = BW_{max}$



2. Case where $SRC_i$ has its QoS sensitivity parameter $\alpha_i$ less than 1/2:

(a) Mf = and Mj = X (1 - S) 

(b) $m_i^G = BW_{min}$ and $m_i^Y = BW_{min}$

Figure 2 shows an example of a QoS sensitive SRC ($\alpha = 0.7$) and Figure 3 gives an example of
a price sensitive SRC ($\alpha = 0.3$). They show the minimum and the maximum amount of bandwidth to be
requested in Green and Yellow for one of the SRC's strategies.

3.1.2 SPC's bandwidth allocation 



Each SRC sends a request to the SPC. Upon reception, the SPC decides how its bandwidth is allocated
to SRCs. The request of each $SRC_i$ is represented by one element of $S_{SRC_i}$. Given a set $\Pi$ of SRCs'
requests $\Pi = \langle s_1, s_2, \dots, s_N \rangle$, where $s_i$ corresponds to the request of $SRC_i$ for any $i$, $1 \le i \le N$, the SPC
gives an answer to each $SRC_i$ represented by a triple $(G_i, Y_i, bw_i)$ defined as follows:

• $bw_i$ represents the amount of bandwidth given by the SPC to $SRC_i$.

• $G_i = 1$ (resp. $Y_i = 1$) means that $SRC_i$ has received a Green (resp. Yellow) connection from the
SPC. Note that the case where $G_i = 1$ and $Y_i = 1$ is not possible.

• If $G_i = 0$ and $Y_i = 0$, then $SRC_i$ has received no connection from the SPC.

Let $config(\Pi)$ be the set of all answers (one answer per SRC) of the SPC to $\Pi$. In other words,

$config(\Pi) = \langle (G_1, Y_1, bw_1), (G_2, Y_2, bw_2), \dots, (G_N, Y_N, bw_N) \rangle$

The answers respect the two following properties. Let $s_i$ be the bandwidth request
$(g_i = [m_i^G, M_i^G], y_i = [m_i^Y, M_i^Y])$ of $SRC_i$.

1. The SPC gives $SRC_i$ an amount of bandwidth equal to $bw_i$, where $bw_i$ is in the requested interval.
More formally, if $G_i = 1$ then $bw_i \in g_i$, and if $Y_i = 1$ then $bw_i \in y_i$.

2. The SPC provides bandwidth to SRCs within the limits of its bandwidth availability in Green and Yellow.
Thus, with $\Psi_S$ the proportion of Green bandwidth in the shared bandwidth $B_S$:

$\sum_{i=1}^{N} G_i \times bw_i \le \Psi_S B_S$ and $\sum_{i=1}^{N} Y_i \times bw_i \le (1 - \Psi_S) B_S$.

Moreover, the SPC allocates its bandwidth in a way that maximizes the following outcome function:

$U_{SPC}(config(\Pi)) = \sum_{i} Prop(bw_i)\,(\mu - \Gamma)\,(G_i + Y_i (1 - \delta))$   (1)

Note that the SPC's outcome function considers only its own profile, described in Section 2.1.1. Given
a SRC strategy set, the SPC allocates to each SRC a connection in Green or Yellow such that its outcome
function is maximized.
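
The two allocation properties and the outcome function can be checked numerically with the sketch below. It rests on several assumptions that are not in the text: the outcome function is read as the sum over SRCs of Prop(bw) times (gain sensitivity minus QoS sensitivity) times (G + Y(1 - preemption risk)), `Prop(bw)` is taken to be the share `bw / b_s` of the shared bandwidth, and the names `is_feasible` and `spc_outcome` are illustrative.

```python
def is_feasible(config, requests, psi_s, b_s):
    """Check one SPC answer against the requested intervals and capacities.

    config:   list of triples (G, Y, bw), one per SRC
    requests: list of ((m_G, M_G), (m_Y, M_Y)) intervals, one per SRC
    psi_s:    proportion of green bandwidth; b_s: total shared bandwidth
    """
    green = yellow = 0.0
    for (g, y, bw), ((m_g, M_g), (m_y, M_y)) in zip(config, requests):
        if g == 1 and y == 1:
            return False                       # never both green and yellow
        if g == 1 and not (m_g <= bw <= M_g):
            return False                       # outside the green interval
        if y == 1 and not (m_y <= bw <= M_y):
            return False                       # outside the yellow interval
        green += g * bw
        yellow += y * bw
    return green <= psi_s * b_s and yellow <= (1 - psi_s) * b_s


def spc_outcome(config, mu, delta, b_s):
    """Outcome of the SPC under the assumed reading of Equation (1)."""
    gamma = 1 - mu                             # the two SPC parameters are dual
    return sum((bw / b_s) * (mu - gamma) * (g + y * (1 - delta))
               for g, y, bw in config)
```

An exhaustive search over the feasible configurations, keeping those with the highest `spc_outcome`, would then give the set of optimal SPC decisions; that search is left out here to keep the sketch short.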

3.1.3 SRC Game Definition 

Game theory models the interactions between players competing for a common resource. In our system,
the formulation of this noncooperative game $G = (M, S, U_k)$ can be described as follows:

• The set of players is M. Each player is a SRC. There are N SRCs.

• The space of pure strategies S is formed by the Cartesian product of the players' sets of pure strategies:

$S = S_{SRC_1} \times S_{SRC_2} \times \dots \times S_{SRC_N}$

Note that for each $SRC_i$, the set $S_{SRC_i}$ is defined in Section 3.1.1. A pure strategy $s_i$ is a value
corresponding to the request, which is a couple $(g_i = [m_i^G, M_i^G], y_i = [m_i^Y, M_i^Y])$ of couples of
integers.

• A set of utility functions $\{U_1, U_2, \dots, U_N\}$ that quantifies the players' preferences over the possible
outcomes of the game. Given a set $\Pi$ of SRCs' requests $\Pi = \langle s_1, s_2, \dots, s_N \rangle$, where $s_i$
corresponds to the strategy of $SRC_i$ for any $i$, $1 \le i \le N$, the SRCs' utilities are determined through
the SPC's allocation. Since several allocation decisions could maximize the SPC's outcome function
given by Equation (1), the SRC's utility corresponds to a mean over all these allocation decisions. Let
$sol(\Pi)$ be the set of $config(\Pi)$ that maximize the SPC's outcome function, with $M = |sol(\Pi)|$.

The utility $U_i(\Pi)$ of $SRC_i$ from $sol(\Pi)$ is expressed as follows:

$U_i(\Pi) = \frac{1}{M} \sum_{c \in sol(\Pi)} gain_i(c)$

The gain of $SRC_i$ from the SPC's allocation decision c in $sol(\Pi)$ is expressed as follows:

$gain_i(c) = Rev_i(c) - Cost_i(c)$

where c represents the triple $(G_i, Y_i, bw_i)$ given to $SRC_i$.

- $Rev_i(c) \in [0, 1]$ represents the SRC's QoS satisfaction from the connection c provided by the
SPC and is expressed as follows:

$Rev_i(c) = Rev(bw_i)\,G_i\,\alpha_i + Rev(bw_i)\,Y_i\,(1 - \delta)\,\alpha_i$
where $Rev(bw_i) \in [0, 1]$

- $Cost_i(c) \in [0, 1]$ represents the cost of the connection c provided by the SPC and is expressed
as follows:

$Cost_i(c) = Cost(bw_i)\,G_i\,\beta_i + Cost(bw_i)\,Y_i\,(1 - \delta)\,\beta_i$
where $Cost(bw_i) \in [0, 1]$

We normalize the utility of $SRC_i$ in order to have $U_i(\Pi) \in [0, 1]$.
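
The gain and the averaged utility of one SRC can be sketched as follows. The shapes of `Rev(bw)` and `Cost(bw)` are not given in the text, so they are passed in as functions; the linear choice used in the usage note is a hypothetical placeholder, and the final normalization to [0, 1] is omitted.

```python
def gain(c, alpha, delta, rev, cost):
    """Gain of one SRC for the answer c = (G, Y, bw).

    alpha: QoS sensitivity of the SRC (price sensitivity is 1 - alpha)
    delta: preemption risk attached to a yellow connection
    rev, cost: functions mapping a bandwidth to a value in [0, 1]
    """
    g, y, bw = c
    beta = 1 - alpha
    weight = g + y * (1 - delta)   # 1 if green, (1 - delta) if yellow, 0 if denied
    return rev(bw) * weight * alpha - cost(bw) * weight * beta


def utility(solutions, alpha, delta, rev, cost):
    """Mean gain over this SRC's triples in all SPC-optimal configurations."""
    return sum(gain(c, alpha, delta, rev, cost) for c in solutions) / len(solutions)
```

For instance, with the hypothetical choice `rev = cost = lambda bw: bw / 10`, a QoS sensitivity of 0.7 and a preemption risk of 0.1, a Green allocation of 10 units yields a higher gain than a Yellow allocation of the same size, and the utility is the average of the two when both appear among the optimal SPC decisions.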

3.2 SRC game equilibrium 

Since each $SRC_i$ has a finite set of strategies, this game has a mixed Nash equilibrium [5]. In the following,
we study the existence of a pure Nash equilibrium in the femto access sharing game using the properties of
potential games. The definition of a potential game can be found in [8].

Theorem 1. Each instance of the game restricted to SRCs admits at least one pure Nash equilibrium.
The detailed proof of this theorem is given elsewhere; only a sketch is provided here.

Sketch of proof: The main idea of this proof is to show that the Best Response Dynamic in this game
converges to a pure Nash equilibrium. The Best Response Dynamic algorithm corresponds to the computation of a
sequence of profiles. Let C be an arbitrary profile. If no player has an incentive to change its strategy in C,
then C is a pure Nash equilibrium and the algorithm stops. If at least one player has an incentive to change its
strategy in C, then this player changes its strategy by choosing its best response, and thus we move to a new
profile C'. Then, the algorithm applies the same process to C', and so on. If some players have an incentive to
change their strategy, this means that the SPC reduces its non-reserved bandwidth. Thus, the Best Response
Dynamic algorithm terminates, since at each step the free SPC's bandwidth is reduced.
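
The Best Response Dynamic described in the sketch of proof can be written generically as below. This is a minimal sketch for any finite game, not the game restricted to SRCs itself: the toy utility (each player gets its request if the total fits a capacity of 19, and nothing otherwise) is a hypothetical stand-in for the SPC's allocation.

```python
def best_response_dynamics(strategy_sets, utility, start, max_rounds=1000):
    """Iterate unilateral best responses until no player can improve.

    strategy_sets: one list of pure strategies per player
    utility(i, profile): payoff of player i under the tuple `profile`
    Returns a pure Nash equilibrium, or None if max_rounds is exhausted.
    """
    profile = tuple(start)
    for _ in range(max_rounds):
        improved = False
        for i, choices in enumerate(strategy_sets):
            current = utility(i, profile)
            # best reply of player i, the other strategies being fixed
            best = max(choices,
                       key=lambda s: utility(i, profile[:i] + (s,) + profile[i + 1:]))
            candidate = profile[:i] + (best,) + profile[i + 1:]
            if utility(i, candidate) > current:
                profile = candidate
                improved = True
        if not improved:
            return profile          # nobody wants to deviate: pure Nash equilibrium
    return None


CAPACITY = 19                       # hypothetical shared capacity

def toy_utility(i, profile):
    """Each player receives its request only if all requests fit together."""
    return profile[i] if sum(profile) <= CAPACITY else 0

equilibrium = best_response_dynamics([[9, 10], [9, 10]], toy_utility, start=(10, 10))
```

Starting from the profile where both players over-request, the first player lowers its request and the process stops at a profile where no unilateral change helps.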
Figure 4: Algorithm principle of the game restricted to SRCs

4 Learning the Nash equilibrium of the game restricted to SRCs

In Section 3, we have proved that the game restricted to SRCs admits at least one pure Nash equilibrium.
Now, we want to know whether a distributed algorithm (in which each player knows only its local information) can
converge to a pure Nash equilibrium.

4.1 Algorithm Principle 

In this section, we assume that we are in a distributed context. So, each SRC player learns,
based only on its local information, the strategy maximizing its gain.

For a fixed $\Psi_S$ (the proportion of Green bandwidth), we apply steps 2 to 5 of the algorithm presented in Figure 4:

2. Each SRC sends a request using a specific distributed algorithm denoted by $A_{dist}$.

3. The SPC defines its decision $sol(\Pi)$.

4. The SPC sends its decision to all SRCs.

5. Each SRC computes its utility from $sol(\Pi)$.

We repeat all these steps until convergence of Algorithm $A_{dist}$. We denote by $\Pi^*$ the SRC
strategy profile to which $A_{dist}$ converges. We now present Algorithm $A_{dist}$. In [3], it has been
proved that if the considered game has at least one pure Nash equilibrium and if there exists a sufficiently
small value of the learning speed parameter for which the distributed learning algorithm converges, then
the point of convergence of this algorithm is a pure Nash equilibrium. We say that this algorithm weakly
converges to a Nash equilibrium.

4.2 Algorithm $A_{dist}$

Algorithm $A_{dist}$ is a decentralized learning algorithm for Nash equilibria in multi-person stochastic games
with incomplete information. The principle of $A_{dist}$ is described as follows.

We consider that the strategic process of each SRC player follows a discrete learning technique.

• Each $SRC_i$ updates its strategy at step t, denoted by $s_i^t$, following its local mixed strategy (a probability
distribution assigned to its available pure strategies). Each $SRC_i$ thus computes its new
probability vector $s_i^{t+1}$ using only a set of local information $(s_i^t, margin_i^t, gain_i^t)$, where $gain_i^t$
represents the gain from the SPC's allocation decision and $margin_i^t$ is the pure strategy in $S_{SRC_i}$ played at step t.

• At each step t, each SRC randomly chooses one strategy ($margin_i^t$) and increases its probability of
choosing this strategy ($s_i^t$) according to its gain ($gain_i^t$) and a learning parameter $b \in [0, 1]$ which
modulates the learning speed of the different SRC players.

The learning technique is based on the following update rule:

If $j \ne margin_i^t$ then $s_{i,j}^{t+1} = s_{i,j}^t - b\,u_i^t\,s_{i,j}^t$

else $s_{i,j}^{t+1} = s_{i,j}^t + b\,u_i^t \sum_{k \ne margin_i^t} s_{i,k}^t$

such that $u_i^t = \frac{gain_i^t - A_i^t}{U_i^t - A_i^t}$ is the normalized utility.

The variables $U_i^t = \max_{k \le t} gain_i^k$ and $A_i^t = \min_{k \le t} gain_i^k$ correspond respectively to the
maximum utility and the minimum utility of $SRC_i$ up to iteration t. Note that it is possible to consider
$A_i^t = 0$.

The aim of this update rule is the following: each SRC player lowers the probabilities associated
with the margins not played at step t by the same percentage. The sum of the removed amounts is then added to
the probability associated with the margin played at step t.
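
One step of this update can be sketched as follows. The code assumes the reading given in the explanation above: every strategy that was not played loses the fraction b times u of its probability, and the removed mass goes to the played strategy. The function names are illustrative, and the gain history passed to `normalized_utility` is assumed to include the current gain.

```python
def normalized_utility(gain, history):
    """Rescale a gain to [0, 1] using the extreme gains observed so far."""
    hi, lo = max(history), min(history)
    return 0.0 if hi == lo else (gain - lo) / (hi - lo)


def learning_step(probs, played, u, b=0.1):
    """Update one SRC's mixed strategy after one requested connection.

    probs:  current probability of each pure strategy (sums to 1)
    played: index of the strategy played at this step
    u:      normalized utility in [0, 1]
    b:      learning speed parameter in [0, 1]
    """
    others = sum(p for j, p in enumerate(probs) if j != played)
    new = [p - b * u * p for p in probs]          # same percentage removed everywhere
    new[played] = probs[played] + b * u * others  # removed mass goes to the played one
    return new
```

Because the mass removed from the other strategies is exactly what the played strategy receives, the vector remains a probability distribution after every step, and a zero utility leaves it unchanged.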

As stated in Theorem 1, the game restricted to SRCs has at least one pure Nash equilibrium.
In this article, we try to answer the following questions: Does $A_{dist}$ converge? Is the point of
convergence, if any exists, a pure Nash equilibrium? Is this algorithm reliable for Nash equilibrium computation?

4.3 Simulations 

In our simulations, we consider N SRCs: each $SRC_i$ is characterized by $\alpha_i$ (the QoS sensitivity
parameter of $SRC_i$) and requests a connection from the SPC, characterized by $\mu$, to download a file.

We only focus on the case where all SRCs belong to the same category, and more specifically on QoS sensitive SRCs.
Indeed, this case is the hardest one in which to reach stability, since the SRCs' requirements are conflicting. In the
other possible cases, Algorithm $A_{dist}$ converges.

The simulations presented consider an extremely gain sensitive SPC (i.e. $\mu = 1$) and N
QoS sensitive SRCs. The N SRCs want to download a file of 1 Mbyte. We consider $T_2 = 300$ s,
$\mu = 1$, $\delta = 0.1$, $\varepsilon = 0.1$, $\kappa = 0.1$, $\Psi_S = 0.5$ and $B_S = 20$ Mb/s. Since a femto access can support only
8 communications, we run simulations where $2 \le N \le 8$, keeping the same input presented above. In
this article, we present the results only for 4 and 5 SRCs. The first simulation presents a scenario where
SRCs have only two strategies (extremely QoS sensitive SRCs³). In the second simulation, SRCs have more
than two possible strategies. In our simulations, we consider the following strategy notation: $s_i = Rev\_Th_i$.
4.3.1 Scenario 1 with 5 SRCs 



3 An extremely QoS sensitive SRC is a SRC with $\alpha = 1$.

Figure 6: Variation of SRCs expected gain in scenario 1



For this scenario, we first compute the Nash equilibria of the game analytically. To do so, we
follow the same steps presented in Figure 4, except that the SRC strategy profile is not generated with
Algorithm $A_{dist}$: we consider all the possible SRC strategy profiles and thus fill the SRCs' payoff matrix
with the utilities corresponding to each SRC strategy profile. Figure 5 highlights 10 pure Nash equilibria
circled in red. These Nash equilibria are detected analytically through the SRCs' payoff matrix, since at these
profiles no SRC can raise its utility by a unilateral deviation. Now, we run Algorithm $A_{dist}$. First, we
verify whether it converges, and then we check whether the point of convergence, if any, is
one of the pure Nash equilibria detected analytically. To do so, we consider b = 0.1 and 1000
iterations. In Figures 6 and 9, each point represents the mean over 15 iterations of respectively the SRCs' expected
gain value and the SRCs' strategy probability.
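
The analytical detection amounts to scanning every strategy profile and keeping those from which no player gains by deviating alone. The sketch below shows this exhaustive check on a generic finite game; the two-player toy utility with a capacity of 19 is a hypothetical stand-in for the SRCs' payoff matrix, which in the paper is filled from the SPC's allocation.

```python
from itertools import product


def pure_nash_equilibria(strategy_sets, utility):
    """Enumerate all profiles and keep those with no profitable unilateral deviation."""
    equilibria = []
    for profile in product(*strategy_sets):
        stable = True
        for i, choices in enumerate(strategy_sets):
            current = utility(i, profile)
            for s in choices:
                deviation = profile[:i] + (s,) + profile[i + 1:]
                if utility(i, deviation) > current:
                    stable = False      # player i would rather play s
                    break
            if not stable:
                break
        if stable:
            equilibria.append(profile)
    return equilibria


CAPACITY = 19                           # hypothetical shared capacity

def toy_utility(i, profile):
    """Each player receives its request only if all requests fit together."""
    return profile[i] if sum(profile) <= CAPACITY else 0

found = pure_nash_equilibria([[9, 10], [9, 10]], toy_utility)
```

The number of profiles grows as the product of the strategy set sizes, so this exhaustive check is only practical for small instances such as the ones simulated here.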

$A_{dist}$ converges after 780 iterations. Figure 6 shows that the SRCs' expected gains stabilize as follows:
Expected gain($SRC_1$) = Expected gain($SRC_2$) = Expected gain($SRC_4$) = 0.45 and Expected gain($SRC_3$) = Expected
gain($SRC_5$) = 0.33.

In Figure 9, we observe that the convergence point (the point where each SRC has a pure strategy) $\Pi^* =
(1, 1, 0.9, 1, 0.9)$ matches one of the pure Nash equilibria computed analytically.

For a number of SRCs varying from 2 to 8, we have checked by simulations that a pure Nash equilibrium
of the game restricted to SRCs is reachable, keeping the same input as in scenario 1.

Figure 8: Variation of SRCs expected gain for Scenario 2

The following table summarizes the number of iterations necessary for $A_{dist}$ to converge for N varying
from 2 to 8. This result holds for SRCs with only two strategies.



Number of SRCs          2     3     4     5     6     7     8
Number of iterations    150   300   450   780   800   810   830

Table 1: Number of iterations for $A_{dist}$ convergence



4.3.2 Scenario 2 with 4 SRCs 

In the following, we check whether having more than two possible strategies per SRC
affects the results. To do so, we consider 4 SRCs defined as follows: $\alpha_1 = 0.9$, $\alpha_2 = 0.7$, $\alpha_3 = 1$ and
$\alpha_4 = 0.9$.

Figure 9: Variation of SRCs strategies probabilities for Scenario 2



6 pure Nash equilibria are found analytically in the SRCs' utility matrix:

$\Pi = \{(0.9, 0.9, 1, 1); (0.9, 1, 0.9, 1); (0.9, 1, 1, 0.9); (1, 0.9, 0.9, 1); (1, 0.9, 1, 0.9); (1, 1, 0.9, 0.9)\}$.

For Algorithm $A_{dist}$, we consider b = 0.1 and 700 iterations. In Figures 8 and 9, each point
represents the mean over 15 iterations of respectively the SRCs' expected gain value and the SRCs' strategy
probability.

$A_{dist}$ converges after 500 iterations. We notice in Figure 8 that the SRCs' expected gains stabilize as
follows:

Expected gain($SRC_1$) = Expected gain($SRC_2$) = Expected gain($SRC_4$) = 0.45 and Expected gain($SRC_3$) = Expected
gain($SRC_5$) = 0.33

Even with more than two strategies per SRC, $A_{dist}$ still converges to a pure Nash equilibrium. In this
simulation, $\Pi^* = (0.9, 1, 1, 0.9)$ is a pure Nash equilibrium. Figure 9 shows that $SRC_1$, $SRC_2$, $SRC_3$ and
$SRC_4$ need respectively 175, 200, 410 and 500 iterations to learn the strategy maximizing their gain.

When all the SRCs have pure strategies, we reach a stable state for the game restricted to SRCs. In order
to make the system representing this game converge more rapidly, we propose two solutions. In the
first one, we consider that the system is stable as soon as every SRC has a strategy whose probability is at least p
(0 < p < 1).

If we fix p = 0.8, the software installed in the user equipment of $SRC_1$, $SRC_2$, $SRC_3$ and $SRC_4$ needs
respectively 90, 90, 200 and 340 initiated connections to learn the strategy maximizing its gain. After 340
iterations, the system is considered stable. Thus, the number of iterations for $A_{dist}$ convergence is reduced
by 32% (from 500 to 340).
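
The relaxed stopping rule of this first solution can be sketched as below. The function names are illustrative, and the rule is read here as "some strategy of every SRC has reached probability p or more".

```python
def is_stable(mixed_strategies, p=0.8):
    """True once every SRC holds a strategy with probability at least p."""
    return all(max(probs) >= p for probs in mixed_strategies)


def iteration_saving(full, reduced):
    """Relative reduction in the number of iterations."""
    return (full - reduced) / full
```

With the figures reported above, `iteration_saving(500, 340)` gives the 32% reduction, and `is_stable` would be evaluated on the SRCs' probability vectors after each learning step.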

The second solution does not reduce the number of iterations for $A_{dist}$ convergence, but proposes to
reduce its time duration (which can also be seen as the number of requested connections). This solution
triggers the learning process several times per requested connection. This means that for
a connection of duration D, the SRC's software chooses a new strategy following the updated strategy
probability vector every $d = q \times D$ time slot ($q \in [0, 1]$). Thus, the time duration (the number of requested
connections) necessary for $A_{dist}$ convergence is scaled by a factor q.

We have focused on the convergence of $A_{dist}$, taking into consideration several strategies per SRC and
$2 \le N \le 8$. We have found that $A_{dist}$ always converges to one of the pure Nash equilibria detected
analytically in the SRCs' payoff matrix. To conclude, in all our simulations $A_{dist}$ learns a
pure Nash equilibrium of the game restricted to SRCs, whatever the SRCs' profiles are.
4.4 Simulation results



Simulation results have shown that, in the game restricted to SRCs, the distributed learning algorithm
converges to a pure Nash equilibrium. We have checked that this result holds for a number of SRCs
varying from 2 to 8, for SRCs with exactly 2 strategies or more than 2 strategies. The distributed algorithm
converges in a finite number of iterations. Two solutions have been proposed to reduce the number of
requested connections necessary for the convergence of this algorithm. At convergence, the distributed learning
algorithm running in the SRCs' user equipment gives the minimum and the maximum amount of bandwidth to be
requested by each SRC in Green and Yellow.

5 Conclusion and perspectives 

In this article, we investigate the problem of sharing femto access, taking a file transfer application as an
example. Only the competition between SRCs is modeled as a game, considering a fixed SPC's bandwidth split.
Our simulations focus on examples where the SRCs' objectives conflict: SRCs belonging to the same category
of QoS sensitive SRCs.

Simulations have shown the efficiency of the distributed learning algorithm: even if each SRC player
has only local information, the algorithm converged to a pure Nash equilibrium in all the simulated scenarios.

As a perspective, we will consider the femto access sharing game with several SRCs and several SPCs.
The femto access sharing problem will thus be divided into two levels: a first level representing a game
restricted to SRCs and a second level representing a game restricted to SPCs. We will study the properties
of the second level game to check whether they match the properties of the game restricted to SRCs. We
will also focus on optimizing the convergence time of Algorithm $A_{dist}$ applied to both the SRCs game
and the SPCs game.